Introduction

R Session Info

## R version 3.3.2 (2016-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X Yosemite 10.10.5
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] backports_1.0.4 magrittr_1.5    rprojroot_1.1   tools_3.3.2    
##  [5] htmltools_0.3.5 yaml_2.1.14     Rcpp_0.12.10    stringi_1.1.2  
##  [9] rmarkdown_1.3   knitr_1.15.1    stringr_1.1.0   digest_0.6.11  
## [13] evaluate_0.10

Description

This notebook documents educational attainment levels and age-adjusted death rates (AADR) for different causes of death across the United States. The data has been organized to explore the relationship between educational attainment and AADR between 1999 and 2013.

Data

How to download the dataset from Data.World:

  1. Log into your account on https://data.world/
  2. Under the Datasets tab, go to “S17 eDV Project 6”
  3. Scroll down until you reach “Death.csv”
  4. Click on download
  5. Save the file under a folder accessible by Tableau
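
Once the file is saved, it can also be read directly in R. The following is a minimal sketch, not the notebook's actual ETL script: it uses base R's `read.csv`, and it writes a two-row stand-in file so the example runs on its own. The `path` variable, the `(placeholder)` values in `113_CAUSE_NAME`, and the `Alabama` value in `STATE` are assumptions for illustration; the six column names come from the parse log in the ETL section below, where `DEATHS` and `AADR` are reported as character columns.

```r
# Stand-in for the downloaded file (assumption): two sample rows with the six
# columns listed in the parse log. Point `path` at the real Death.csv instead.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "YEAR,113_CAUSE_NAME,CAUSE_NAME,STATE,DEATHS,AADR",
  "1999,(placeholder),Unintentional Injuries,Alabama,2313,52.17",
  "1999,(placeholder),All Causes,Alabama,44806,1009.30"
), path)

# check.names = FALSE keeps the `113_CAUSE_NAME` header exactly as written.
# stringsAsFactors = FALSE keeps text columns as character vectors.
# DEATHS and AADR are read as character first, mirroring the parse log.
deaths <- read.csv(
  path,
  stringsAsFactors = FALSE,
  check.names = FALSE,
  colClasses = c(YEAR = "integer", DEATHS = "character", AADR = "character")
)

# Convert explicitly; any entry that is not a plain number becomes NA
# (with a warning), so it can be inspected rather than silently dropped.
deaths$DEATHS <- as.integer(deaths$DEATHS)
deaths$AADR   <- as.numeric(deaths$AADR)

str(deaths)
```

Reading the two measures as character and converting them afterwards makes any non-numeric entries visible as `NA` values. The log does not say why these columns parsed as character, so the real file may need extra cleaning before conversion.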

ETL Script

## Loading required package: readr
## Loading required package: plyr
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:plyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Loading required package: data.world
## 
## Attaching package: 'data.world'
## The following object is masked from 'package:dplyr':
## 
##     query
## Loading required package: DT
## Parsed with column specification:
## cols(
##   YEAR = col_integer(),
##   `113_CAUSE_NAME` = col_character(),
##   CAUSE_NAME = col_character(),
##   STATE = col_character(),
##   DEATHS = col_character(),
##   AADR = col_character()
## )
## Classes 'tbl_df', 'tbl' and 'data.frame':    13261 obs. of  19 variables:
##  $ State      : chr  "AL" "AL" "AL" "AL" ...
##  $ AreaName   : chr  "Alabama" "Alabama" "Alabama" "Alabama" ...
##  $ edu.total  : int  3239351 3239351 3239351 3239351 3239351 3239351 3239351 3239351 3239351 3239351 ...
##  $ edu.males  : int  1532613 1532613 1532613 1532613 1532613 1532613 1532613 1532613 1532613 1532613 ...
##  $ edu.females: int  1706738 1706738 1706738 1706738 1706738 1706738 1706738 1706738 1706738 1706738 ...
##  $ m_no_school: int  21478 21478 21478 21478 21478 21478 21478 21478 21478 21478 ...
##  $ m_hs       : int  490120 490120 490120 490120 490120 490120 490120 490120 490120 490120 ...
##  $ m_bs       : int  226204 226204 226204 226204 226204 226204 226204 226204 226204 226204 ...
##  $ m_ms       : int  82686 82686 82686 82686 82686 82686 82686 82686 82686 82686 ...
##  $ m_phd      : int  19299 19299 19299 19299 19299 19299 19299 19299 19299 19299 ...
##  $ f_no_school: int  20398 20398 20398 20398 20398 20398 20398 20398 20398 20398 ...
##  $ f_hs       : int  515175 515175 515175 515175 515175 515175 515175 515175 515175 515175 ...
##  $ f_bs       : int  252608 252608 252608 252608 252608 252608 252608 252608 252608 252608 ...
##  $ f_ms       : int  119311 119311 119311 119311 119311 119311 119311 119311 119311 119311 ...
##  $ f_phd      : int  13033 13033 13033 13033 13033 13033 13033 13033 13033 13033 ...
##  $ year       : int  1999 1999 1999 1999 1999 1999 1999 1999 1999 1999 ...
##  $ cause      : chr  "Unintentional Injuries" "All Causes" "Alzheimer's disease" "Homicide" ...
##  $ amt_death  : chr  "2313" "44806" "772" "438" ...
##  $ AADR       : chr  "52.17" "1009.30" "17.80" "9.87" ...
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = ""): invalid factor
## level, NA generated

## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = ""): invalid factor
## level, NA generated

## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = ""): invalid factor
## level, NA generated
## [1] "edu.total"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "edu.males"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "edu.females"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "m_no_school"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "m_hs"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "m_bs"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "m_ms"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "m_phd"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "f_no_school"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "f_hs"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "f_bs"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "f_ms"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "f_phd"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "year"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "amt_death"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
## [1] "AADR"
## Warning in `[<-.factor`(`*tmp*`, is.na(x), value = 0): invalid factor
## level, NA generated
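
The `invalid factor level, NA generated` warnings above are what R emits when a value that is not one of a factor's existing levels (here `""` or `0`) is assigned into that factor; the assignment leaves `NA` in place rather than the intended value. The sketch below reproduces the situation on toy vectors and shows one common way around it: convert the factor to character (or numeric) before replacing missing values. The vectors `f` and `g` are illustrative, not taken from the ETL script, and whether `0` is an appropriate fill value for a missing count or rate is a judgment call this example does not settle.

```r
# Toy text column stored as a factor (illustrative values only).
f <- factor(c("AL", NA, "TX"))

# Assigning "" into the factor would warn and leave NA, because "" is not a level.
# Converting to character first makes the replacement valid.
chr <- as.character(f)
chr[is.na(chr)] <- ""

# Toy numeric-looking column stored as a factor.
g <- factor(c("52.17", NA, "9.87"))

# Go through as.character() so the labels, not the internal level codes,
# are converted to numbers; then fill the missing entries.
num <- as.numeric(as.character(g))
num[is.na(num)] <- 0

print(chr)
print(num)
```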

Visualizations

Tableau

Barchart - AADR by State

The barchart displays the sum of AADR per cause across the United States. Heart Disease and Cancer have consistently higher sums of AADR than the average AADR for all causes.

Barchart - AADR by Year

The barchart displays the sum of AADR per cause from 1999 to 2013. The average sum of AADR for all causes does not vary greatly from year to year. Heart Disease, Cancer, and Stroke have consistently higher sums of AADR than the average AADR for all causes.
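
The "sum of AADR per cause" behind these two barcharts can be sketched in R with `aggregate`. This is an illustration under stated assumptions, not the Tableau calculation itself: the column names follow the `str()` output in the ETL section, the two `AL` rows reuse values shown there, and the two `XX` rows are made-up values for a hypothetical second state so that the sum has something to add.

```r
# Toy data: the AL rows reuse values from the str() output above;
# the XX rows are hypothetical and exist only to make the sum non-trivial.
aadr <- data.frame(
  year  = c(1999L, 1999L, 1999L, 1999L),
  State = c("AL", "AL", "XX", "XX"),
  cause = c("Unintentional Injuries", "Homicide",
            "Unintentional Injuries", "Homicide"),
  AADR  = c(52.17, 9.87, 40.00, 5.00),
  stringsAsFactors = FALSE
)

# Sum AADR over states within each cause and year (the by-year view).
by_cause_year <- aggregate(AADR ~ cause + year, data = aadr, FUN = sum)

# Swapping `year` for `State` gives the per-state view instead.
by_cause_state <- aggregate(AADR ~ cause + State, data = aadr, FUN = sum)

print(by_cause_year)
```

Because AADR is a rate, a sum across states is best read as a relative ranking of causes rather than as a national rate.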

Bachelor’s Degree Attainment: Female

High Level

The barchart displays females who have attained a Bachelor’s Degree. The page shows only the states with high levels of female BS attainment (0.15-0.25). More states had high levels of attainment among females than among males.

Low Level

The barchart displays females who have attained a Bachelor’s Degree. The page shows only the states with low levels of female BS attainment (<0.15).

Bachelor’s Degree Attainment: Male

High Level

The barchart displays males who have attained a Bachelor’s Degree. The page shows only the states with high levels of male BS attainment (0.15-0.25).

Low Level

The barchart displays males who have attained a Bachelor’s Degree. The page shows only the states with low levels of male BS attainment (<0.15).
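
The high and low groupings used on these pages can be sketched in R as follows. This is a guess at the calculation, not the notebook's confirmed method: it assumes the attainment level is the Bachelor's count divided by the same-sex education total (`f_bs / edu.females` and `m_bs / edu.males`), and the helper `level()` is a hypothetical name. The Alabama counts come from the `str()` output in the ETL section. The source gives no label for values above 0.25, so those are left as `NA`.

```r
# Alabama counts as shown in the str() output above.
edu <- data.frame(
  State       = "AL",
  edu.males   = 1532613L,
  edu.females = 1706738L,
  m_bs        = 226204L,
  f_bs        = 252608L,
  stringsAsFactors = FALSE
)

# Assumption: attainment level = Bachelor's count / same-sex education total.
edu$m_bs_share <- edu$m_bs / edu$edu.males
edu$f_bs_share <- edu$f_bs / edu$edu.females

# Hypothetical helper: "Low" for [0, 0.15), "High" for [0.15, 0.25].
# Shares above 0.25 fall outside both ranges and come back as NA.
level <- function(share) {
  cut(share, breaks = c(0, 0.15, 0.25), labels = c("Low", "High"),
      right = FALSE, include.lowest = TRUE)
}

edu$m_level <- level(edu$m_bs_share)
edu$f_level <- level(edu$f_bs_share)

print(edu[, c("State", "m_bs_share", "m_level", "f_bs_share", "f_level")])
```

Under this assumption both Alabama shares come out just under 0.15, which would place the state in the low group for both sexes.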

Shiny

Barchart - AADR by State

The barchart displays the sum of AADR per cause across the United States. Heart Disease and Cancer have consistently higher sums of AADR than the average AADR for all causes.

Barchart - AADR by Year

The barchart displays the sum of AADR per cause from 1999 to 2013. The average sum of AADR for all causes does not vary greatly from year to year. Heart Disease, Cancer, and Stroke have consistently higher sums of AADR than the average AADR for all causes.

Bachelor’s Degree Attainment: Female

High Level

The barchart displays females who have attained a Bachelor’s Degree. The page shows only the states with high levels of female BS attainment (0.15-0.25). More states had high levels of attainment among females than among males.

Low Level

The barchart displays females who have attained a Bachelor’s Degree. The page shows only the states with low levels of female BS attainment (<0.15).

Bachelor’s Degree Attainment: Male

High Level

The barchart displays males who have attained a Bachelor’s Degree. The page shows only the states with high levels of male BS attainment (0.15-0.25).

Low Level

The barchart displays males who have attained a Bachelor’s Degree. The page shows only the states with low levels of male BS attainment (<0.15).